MINIMIZATION OF ENTROPY FUNCTIONALS 



CHRISTIAN LEONARD 



Abstract. Entropy functionals (i.e. convex integral functionals) and extensions of these 
functionals are minimized on convex sets. This paper is aimed at reducing as much as 
possible the assumptions on the constraint set. Dual equalities and characterizations of 
the minimizers are obtained with weak constraint qualifications. 

1. Introduction 

1.1. The entropy minimization problem. Let $R$ be a positive measure on a space $Z$. 
Take a $[0,\infty]$-valued measurable function $\gamma^*$ on $Z\times\mathbb{R}$ such that $\gamma^*(z,\cdot) := \gamma^*_z$ is convex 
and lower semicontinuous for all $z\in Z$. Denote $M_Z$ the space of all signed measures $Q$ 
on $Z$. The entropy functional to be considered is defined by 

$$I(Q) = \begin{cases} \int_Z \gamma^*_z\big(\tfrac{dQ}{dR}(z)\big)\,R(dz) & \text{if } Q\prec R \\ +\infty & \text{otherwise} \end{cases} \qquad (1.1)$$

where $Q\prec R$ means that $Q$ is absolutely continuous with respect to $R$. Assume that for 
each $z$ there exists a unique $m(z)$ which minimizes $\gamma^*_z$ with 

$$\gamma^*_z(m(z)) = 0,\quad \forall z\in Z. \qquad (1.2)$$

Then, $I$ is $[0,\infty]$-valued, its unique minimizer is $mR$ and $I(mR) = 0$. 
This paper is concerned with the minimization problem 

$$\text{minimize } I(Q) \text{ subject to } T_oQ\in C,\quad Q\in M_Z \qquad (1.3)$$

where $T_o: M_Z\to\mathcal{X}_o$ is a linear operator which takes its values in a vector space $\mathcal{X}_o$ and 
$C$ is a convex subset of $\mathcal{X}_o$. 
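
As an illustration of (1.1) and (1.2), the following worked instance may help. It is only a sketch: the particular integrand below (the Boltzmann-type integrand) is an assumed choice made here for concreteness, not a definition taken from this section.

```latex
% Assumed illustrative integrand, independent of z:
\gamma^*_z(t) =
\begin{cases}
  t\log t - t + 1, & t > 0,\\
  1, & t = 0,\\
  +\infty, & t < 0,
\end{cases}
\qquad\text{so that}\qquad m(z) = 1,\quad \gamma^*_z(m(z)) = 0 .
% The entropy functional (1.1) then reads, for 0 \le Q \prec R,
I(Q) = \int_Z \Big(\frac{dQ}{dR}\log\frac{dQ}{dR} - \frac{dQ}{dR} + 1\Big)\,dR ,
% and, when Q and R are both probability measures, it reduces to
I(Q) = \int_Z \log\Big(\frac{dQ}{dR}\Big)\,dQ .
```

With this choice the unique minimizer $mR$ of $I$ is $R$ itself, in agreement with the statement following (1.2).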



Date: October 07. 

2000 Mathematics Subject Classification. 46E30, 46N10, 49K22, 49N15, 49N45. 

Key words and phrases. Entropy, convex optimization, constraint qualification, convex conjugate, Orlicz spaces. 






1.2. Presentation of the results. Our aim is to reduce as much as possible the restrictions 
on the convex set $C$. Denoting $\hat{Q}$ the minimizer of (1.3), the geometric picture is 
that some level set of $I$ is tangent at $\hat{Q}$ to the constraint set $T_o^{-1}C$. Since these sets are 
convex, they are separated by some affine hyperplane and the analytic description of this 
separation yields the characterization of $\hat{Q}$. Of course the Hahn-Banach theorem is the key. 
Standard approaches require $C$ to be open with respect to some given topology in order 
to be allowed to apply it. In the present paper, one chooses to use a topological structure 
which is designed for the level sets of $I$ to "look like" open sets, so that the Hahn-Banach 
theorem can be applied without assuming too much on $C$. 

This strategy is implemented in [17] in an abstract setting suitable for several applications. 
It is a refinement of the standard saddle-point method [22] where convex conjugates 
play an important role. The proofs of the present article are applications of the general 
results of [17]. 

Clearly, for the problem (1.3) to be attained, $T_o^{-1}C$ must share a supporting hyperplane 
with some level set of $I$. This is the reason why it is assumed to be closed with respect 
to the above mentioned topological structure. This will be the only restriction to be kept 
together with the interior specification (1.4) below. 

Dual equalities and primal attainment are obtained under the weakest possible assumption: 

$$C\cap T_o\,\mathrm{dom}\,I \neq \emptyset$$

where $\mathrm{dom}\,I := \{Q\in M_Z;\ I(Q)<\infty\}$ is the effective domain of $I$ and $T_o\,\mathrm{dom}\,I$ is its 
image by $T_o$. The main result of this article is the characterization of the minimizers of 
(1.3) in the interior case which is specified by 

$$C\cap \mathrm{icor}\,(T_o\,\mathrm{dom}\,I) \neq \emptyset \qquad (1.4)$$

where $\mathrm{icor}\,(T_o\,\mathrm{dom}\,I)$ is the intrinsic core of $T_o\,\mathrm{dom}\,I$. The notion of intrinsic core does not 
rely on any topology; it gives the largest possible interior set. For comparison, a usual 
form of constraint qualification required for the representation of the minimizers of (1.3) 
is 

$$\mathrm{int}\,(C)\cap T_o\,\mathrm{dom}\,I \neq \emptyset \qquad (1.5)$$

where $\mathrm{int}\,(C)$ is the interior of $C$ with respect to a topology which is not directly connected 
to the "geometry" of $I$. In particular, $\mathrm{int}\,(C)$ must be nonempty; this is an important 
restriction. The constraint qualification (1.4) is weaker. 

An extension of Problem (1.3) is also investigated. One considers an extension $\bar{I}$ of the 
entropy $I$ to a vector space $L_Z$ which contains $M_Z$ and may also contain singular linear 
forms which are not $\sigma$-additive. The extended problem is 

$$\text{minimize } \bar{I}(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in L_Z \qquad (1.6)$$

Even if $I$ is strictly convex, $\bar{I}$ isn't strictly convex in general so that (1.6) may admit 
several minimizers. There are situations where (1.3) is not attained in $M_Z$ while (1.6) is 
attained in $L_Z$. Other relations between these minimization problems are investigated by 
the author in [18] with probabilistic questions in mind. 

1.3. Literature about entropy minimization. Entropy minimization problems ap- 
pear in many areas of applied mathematics and sciences. The literature about the min- 
imization of entropy functionals under convex constraints is considerable: many papers 
are concerned with an engineering approach, working on the implementation of numerical 
procedures in specific situations. In fact, entropy minimization is a popular method to 
solve ill-posed inverse problems. 

Rigorous general results on this topic are quite recent. Let us cite, among others, the 
main contribution of Borwein and Lewis: [T], [2], [5], [1], [5], [H] together with the paper 
[23] by Teboulle and Vajda. In these papers, topological constraint qualifications of the 
type of (1.5) are required. Such restrictions are removed here. 

With a geometric point of view, Csiszár [8], [9] provides a complete treatment of (1.3) with 
the relative entropy (see Section 6.1) under the weak assumption (1.4). The behavior of 
minimizing sequences of general entropy functionals is studied in [10]. 

By means of a method different from the saddle-point approach, the author has already 
studied in [15], [18] entropy minimization problems under affine constraints (corresponding 
to $C$ reduced to a single point) and more restrictive assumptions on $\gamma^*$. 
The present article extends these results. 

Outline of the paper. The minimization problems (1.3) and (1.6) are described in 
detail in Section 2. In Section 3, the main results of [17] about the extended saddle-point 
method are recalled. Section 4 is devoted to the extended problem (1.6) and Section 5 to 
(1.3). Important examples of entropies and constraints are presented in Section 6. 

Notation. Let $X$ and $Y$ be topological vector spaces. The algebraic dual space of $X$ is 
$X^*$, the topological dual space of $X$ is $X'$. The topology of $X$ weakened by $Y$ is $\sigma(X,Y)$ 
and one writes $\langle X,Y\rangle$ to specify that $X$ and $Y$ are in separating duality. 
Let $f: X\to[-\infty,+\infty]$ be an extended numerical function. Its convex conjugate with 
respect to $\langle X,Y\rangle$ is $f^*(y) = \sup_{x\in X}\{\langle x,y\rangle - f(x)\}\in[-\infty,+\infty]$, $y\in Y$. Its subdifferential 
at $x$ with respect to $\langle X,Y\rangle$ is $\partial_Y f(x) = \{y\in Y;\ f(x+\xi)\ge f(x)+\langle y,\xi\rangle,\ \forall\xi\in X\}$. If no 
confusion occurs, one writes $\partial f(x)$. 

The intrinsic core of a subset $A$ of a vector space is $\mathrm{icor}\,A = \{x\in A;\ \forall x'\in\mathrm{aff}\,A,\ \exists t>0,\ [x,x+t(x'-x)[\,\subset A\}$ 
where $\mathrm{aff}\,A$ is the affine space spanned by $A$. $\mathrm{icordom}\,f$ is the 
intrinsic core of the effective domain of $f$: $\mathrm{dom}\,f = \{x\in X;\ f(x)<\infty\}$. 
The indicator of a subset $A$ of $X$ is defined by 

$$\iota_A(x) = \begin{cases} 0, & \text{if } x\in A \\ +\infty, & \text{otherwise} \end{cases},\qquad x\in X.$$
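
To see how the intrinsic core defined in the Notation paragraph can be nonempty when the topological interior is empty, here is a small worked instance. The set $A$ below is a hypothetical example chosen for illustration.

```latex
% Assumed example in the vector space \mathbb{R}^2:
A = \{(t,0);\ 0\le t\le 1\},\qquad \mathrm{aff}\,A = \mathbb{R}\times\{0\}.
% A has empty interior in \mathbb{R}^2, but moving from x = (s,0) with 0<s<1
% towards any x' \in \mathrm{aff}\,A stays in A for a short time, hence
\mathrm{int}\,A = \emptyset,\qquad \mathrm{icor}\,A = \{(t,0);\ 0<t<1\}.
% The endpoints are excluded: from x=(0,0) towards x'=(-1,0) one leaves A immediately.
```

This is the mechanism by which a qualification stated with an intrinsic core, such as (1.4), can hold for constraint sets with empty interior.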



2. Presentation of the minimization problems $(P_C)$ and $(\overline{P}_C)$ 

The problem (1.3) and its extension (1.6) are introduced. Their correct mathematical 
statements necessitate the notion of Orlicz spaces. The definitions of good and bad 
constraints are given and the main assumptions are collected at the end of this section. 

2.1. Orlicz spaces. To state the minimization problem (1.3) and its extension correctly, 
one will need to talk in terms of Orlicz spaces related to the function $\gamma^*$. 
Let us recall some basic definitions and results. A set $Z$ is furnished with a $\sigma$-finite 
nonnegative measure $R$ on a $\sigma$-field which is assumed to be $R$-complete. A function 
$\rho: Z\times\mathbb{R}\to[0,\infty]$ is said to be a Young function if for $R$-almost every $z$, $\rho(z,\cdot)$ is a convex even 
$[0,\infty]$-valued function on $\mathbb{R}$ such that $\rho(z,0)=0$ and there exists a measurable function 
$z\mapsto s_z>0$ such that $0<\rho(z,s_z)<\infty$. 

In the sequel, every numerical function on Z is supposed to be measurable. 




ip{u) dR and / = for short, instead of 
Definitions 2.1 (The Orlicz spaces $\mathcal{L}_\rho$, $\mathcal{E}_\rho$, $L_\rho$ and $E_\rho$). The Orlicz space associated with 
$\rho$ is defined by $\mathcal{L}_\rho(Z,R) = \{u: Z\to\mathbb{R};\ \|u\|_\rho<+\infty\}$ where the Luxemburg norm $\|\cdot\|_\rho$ is 
defined by $\|u\|_\rho = \inf\{\beta>0;\ \int_Z\rho(z,u(z)/\beta)\,R(dz)\le 1\}$. Hence, 

$$\mathcal{L}_\rho(Z,R) = \Big\{u: Z\to\mathbb{R};\ \exists\alpha_o>0,\ \int_Z\rho\big(z,\alpha_o u(z)\big)\,R(dz)<\infty\Big\}.$$

A subspace of interest is 

$$\mathcal{E}_\rho(Z,R) = \Big\{u: Z\to\mathbb{R};\ \forall\alpha>0,\ \int_Z\rho\big(z,\alpha u(z)\big)\,R(dz)<\infty\Big\}.$$

Now, let us identify the $R$-a.e. equal functions. The corresponding spaces of equivalence 
classes are denoted $L_\rho(Z,R)$ and $E_\rho(Z,R)$. 
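
The Luxemburg norm of Definitions 2.1 can be approximated numerically when $R$ is a finite discrete measure. The sketch below is an illustration only: the function name `luxemburg_norm`, the discrete setting and the bisection scheme are assumptions introduced here, and the Young function is taken independent of $z$.

```python
import math

def luxemburg_norm(u, weights, rho, iters=200):
    """Approximate inf{beta > 0 : sum_i w_i * rho(u_i / beta) <= 1}
    for a discrete measure R = sum_i w_i * delta_i (hypothetical helper)."""
    def modular(beta):
        # Value of the integral of rho(u / beta) with respect to R.
        try:
            return sum(w * rho(x / beta) for x, w in zip(u, weights))
        except OverflowError:
            return math.inf

    if all(x == 0 for x in u):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:      # enlarge hi until it is admissible
        hi *= 2.0
    for _ in range(iters):        # beta -> modular(beta) is nonincreasing
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# Usage: with rho(s) = s**2 the Luxemburg norm is the usual L2(R) norm.
weights = [0.5, 0.5]
u = [3.0, 4.0]
print(luxemburg_norm(u, weights, lambda s: s * s))
```

For an exponentially growing Young function such as $s\mapsto e^{|s|}-|s|-1$ the same routine applies; only the function passed as `rho` changes.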

Of course $E_\rho\subset L_\rho$. Note that if $\rho$ doesn't depend on $z$ and $\rho(s_o)=\infty$ for some $s_o>0$, 
$E_\rho$ reduces to the null space and if in addition $R$ is bounded, $L_\rho$ is $L_\infty$. On the other 
hand, if $\rho$ is a finite function which doesn't depend on $z$ and $R$ is bounded, $E_\rho$ contains 
all the bounded functions. 

Duality in Orlicz spaces is intimately linked with convex conjugacy. The convex 
conjugate $\rho^*$ of $\rho$ is defined by $\rho^*(z,t) = \sup_{s\in\mathbb{R}}\{st-\rho(z,s)\}$. It is also a Young function 
so that one may consider the Orlicz space $L_{\rho^*}$. 

Theorem 2.2 (Representation of $E'_\rho$). Suppose that $\rho$ is a finite Young function. Then, 
the dual space of $E_\rho$ is isomorphic to $L_{\rho^*}$. 

Proof. For a proof of this result, see ([12], Thm 4.8). □ 

A continuous linear form $\ell\in L'_\rho$ is said to be singular if for all $u\in L_\rho$, there exists a 
decreasing sequence of measurable sets $(A_n)$ such that $R(\cap_n A_n)=0$ and for all $n\ge 1$, 
$\langle\ell,u\mathbf{1}_{Z\setminus A_n}\rangle = 0$. Let us denote $L^s_\rho$ the subspace of $L'_\rho$ of all singular forms. 

Theorem 2.3 (Representation of $L'_\rho$). Let $\rho$ be any Young function. The dual space of 
$L_\rho$ is isomorphic to the direct sum $L'_\rho = (L_{\rho^*}\cdot R)\oplus L^s_\rho$. This implies that any $\ell\in L'_\rho$ is 
uniquely decomposed as 

$$\ell = \ell^a + \ell^s \qquad (2.4)$$

with $\ell^a\in L_{\rho^*}\cdot R$ and $\ell^s\in L^s_\rho$. 

Proof. When $L_\rho = L_\infty$ this result is the usual representation of $L'_\infty$. 
When $\rho$ is a finite function, this result is ([13], Theorem 2.2). 

The general result is proved in [19], with $\rho$ not depending on $z$ but the extension to a 
$z$-dependent $\rho$ is obvious. □ 

In the decomposition (2.4), $\ell^a$ is called the absolutely continuous part of $\ell$ while $\ell^s$ is 
its singular part. 

Proposition 2.5. Let us assume that $\rho$ is finite. Then, $\ell\in L'_\rho$ is singular if and only if 
$\langle\ell,u\rangle = 0$, for all $u$ in $E_\rho$. 

Proof. This result is ([13], Proposition 2.1). □ 

The function $\rho$ is said to satisfy the $\Delta_2$-condition if 

$$\text{there exist } C>0,\ s_o\ge 0 \text{ such that } \forall s\ge s_o,\ \rho(2s)\le C\rho(s) \qquad (2.6)$$

If $s_o=0$, the $\Delta_2$-condition is said to be global. When $R$ is bounded, in order that $E_\rho=L_\rho$, 
it is enough that $\rho$ satisfies the $\Delta_2$-condition. When $R$ is unbounded, this equality still 
holds if the $\Delta_2$-condition is global. Consequently, if $\rho$ satisfies the $\Delta_2$-condition we have 
$L'_\rho = L_{\rho^*}\cdot R$ so that $L^s_\rho$ reduces to the null vector space. 
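
Two standard instances may help to read the $\Delta_2$-condition (2.6). Both Young functions below are assumed examples, independent of $z$, and are not taken from this section.

```latex
% Power growth: the global \Delta_2-condition holds.
\rho(s) = |s|^p,\ p\ge 1:\qquad \rho(2s) = 2^p\,\rho(s)\quad\forall s\ge 0,
\qquad\text{so (2.6) holds with } C = 2^p,\ s_o = 0 .
% Exponential growth: (2.6) fails for every C and every s_o.
\rho(s) = e^{|s|} - |s| - 1:\qquad
\frac{\rho(2s)}{\rho(s)} = \frac{e^{2s}-2s-1}{e^{s}-s-1}\ \xrightarrow[s\to\infty]{}\ +\infty .
```

In the first case the singular part of the dual vanishes, as stated above; in the second case nothing of the sort can be concluded from (2.6).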

2.2. The minimization problem $(P_C)$. Before introducing an extended minimization 
problem, let us state properly the basic problem (1.3). 



Relevant Orlicz spaces. Since $\gamma^*_z$ is closed convex for each $z$, it is the convex conjugate of 
some closed convex function $\gamma_z$. Defining 

$$\lambda(z,s) = \gamma(z,s) - m(z)s,\qquad z\in Z,\ s\in\mathbb{R},$$

where $m$ satisfies (1.2), one sees that for $R$-a.e. $z$, $\lambda_z$ is a nonnegative convex function 
and it vanishes at 0. Hence, 

$$\lambda_\diamond(z,s) = \max[\lambda(z,s),\lambda(z,-s)]\in[0,\infty],\qquad z\in Z,\ s\in\mathbb{R},$$

is a Young function. We shall use Orlicz spaces associated with $\lambda_\diamond$ and $\lambda^*_\diamond$. 

We denote the space of $R$-absolutely continuous signed measures having a density in the 
Orlicz space $L_{\lambda^*_\diamond}$ by $L_{\lambda^*_\diamond}R$. The effective domain of $I$ is included in $mR + L_{\lambda^*_\diamond}R$. 
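
The passage from $\gamma^*$ to $\gamma$, $\lambda$ and $\lambda_\diamond$ can be followed on a concrete integrand. The computation below is a sketch under the assumption that $\gamma^*$ is the Boltzmann-type integrand $t\log t - t + 1$ (independent of $z$); this choice is made here for illustration.

```latex
% Assumed integrand: \gamma^*(t) = t\log t - t + 1 for t \ge 0 (+\infty for t<0), so m = 1.
\gamma(s) = \sup_{t\ge 0}\{st - t\log t + t - 1\} = e^{s} - 1
\qquad(\text{the supremum is attained at } t = e^{s}),
\\
\lambda(s) = \gamma(s) - m\,s = e^{s} - s - 1 \ \ge 0,\qquad \lambda(0) = 0,
\\
\lambda_\diamond(s) = \max[\lambda(s),\lambda(-s)] = e^{|s|} - |s| - 1
\qquad(\text{since } \sinh s \ge s \text{ for } s\ge 0),
\\
\lambda^*(t) = \sup_{s}\{st - e^{s} + s + 1\} = (1+t)\log(1+t) - t,\quad t > -1 .
```

One checks directly that $\gamma^*(t) = \lambda^*(t-m)$ with $m=1$, which is the relation used below to pass from $Q$ to $\ell = Q - mR$.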



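Constraint. In order to define the constraint, take a vector space $\mathcal{X}_o$ and a function 
$\theta: Z\to\mathcal{X}_o$. One wants to give a meaning to the formal constraint $\int_Z\theta\,dQ = x$ with 
$Q\in L_{\lambda^*_\diamond}R$ and $x\in\mathcal{X}_o$. Suppose that $\mathcal{X}_o$ is the algebraic dual space of some vector space 
$\mathcal{Y}_o$ and define for all $y\in\mathcal{Y}_o$, 

$$T_o^*y(z) := \langle y,\theta(z)\rangle_{\mathcal{Y}_o,\mathcal{X}_o},\qquad z\in Z. \qquad (2.7)$$

Assuming that 

$$T_o^*\mathcal{Y}_o\subset\mathcal{L}_{\lambda_\diamond}, \qquad (2.8)$$

Hölder's inequality in Orlicz spaces allows to define the constraint operator $T_o\ell := \int_Z\theta\,d\ell$ 
for each $\ell\in L_{\lambda^*_\diamond}R$ by 

$$\Big\langle y,\int_Z\theta\,d\ell\Big\rangle_{\mathcal{Y}_o,\mathcal{X}_o} = \int_Z\langle y,\theta(z)\rangle_{\mathcal{Y}_o,\mathcal{X}_o}\,\ell(dz),\qquad\forall y\in\mathcal{Y}_o. \qquad (2.9)$$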



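Minimization problem. Consider the minimization problem 

$$\text{minimize } I(Q) \text{ subject to } \int_Z\theta\,d(Q-mR)\in C_o,\quad Q\in mR+L_{\lambda^*_\diamond}R \qquad (P_{C_o})$$

where $C_o$ is a convex subset of $\mathcal{X}_o$. One sees with $\gamma^*_z(t) = \lambda^*_z(t-m(z))$ that $I_{\gamma^*}(Q) = 
I_{\lambda^*}(Q-mR)$. Therefore, the problem $(P_{C_o})$ is equivalent to 

$$\text{minimize } I_{\lambda^*}(\ell) \text{ subject to } \int_Z\theta\,d\ell\in C_o,\quad \ell\in L_{\lambda^*_\diamond}R \qquad (2.10)$$

with $\ell = Q-mR$. If the function $m$ satisfies $m\in L_{\lambda^*_\diamond}$, one sees with (2.8) and Hölder's 
inequality in Orlicz spaces that the vector $x_o = \int_Z\theta m\,dR\in\mathcal{X}_o$ is well-defined in the weak 
sense. Therefore, $(P_{C_o})$ is 

$$\text{minimize } I(Q) \text{ subject to } \int_Z\theta\,dQ\in C,\quad Q\in L_{\lambda^*_\diamond}R \qquad (P_C)$$

with $C = x_o + C_o$. 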

2.3. The extended minimization problem $(\overline{P}_C)$. If the Young function $\lambda_\diamond$ doesn't 
satisfy the $\Delta_2$-condition (2.6), for instance if it has an exponential growth at infinity as 
in (6.1) or even worse as in (6.3), the small Orlicz space $\mathcal{E}_{\lambda_\diamond}$ may be a proper subset of 
$\mathcal{L}_{\lambda_\diamond}$. Consequently, for some functions $\theta$, the integrability property 

$$T_o^*\mathcal{Y}_o\subset\mathcal{E}_{\lambda_\diamond} \qquad (2.11)$$

or equivalently 

$$\forall y\in\mathcal{Y}_o,\quad \int_Z\lambda(\langle y,\theta\rangle)\,dR<\infty \qquad (A^\forall_\theta)$$

may not be satisfied while the weaker property (2.8): $T_o^*\mathcal{Y}_o\subset\mathcal{L}_{\lambda_\diamond}$, or equivalently 

$$\forall y\in\mathcal{Y}_o,\ \exists\alpha>0,\quad \int_Z\lambda(\alpha\langle y,\theta\rangle)\,dR<\infty \qquad (A^\exists_\theta)$$

holds. In this situation, analytical complications occur (see Section 4). This is the reason 
why constraints satisfying $(A^\forall_\theta)$ are called good constraints, while constraints satisfying 
$(A^\exists_\theta)$ but not $(A^\forall_\theta)$ are called bad constraints. 
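
The distinction between good and bad constraints can be seen on a one-dimensional instance. Everything in the following computation (the space $Z$, the measure $R$, the function $\theta$ and the exponential-type $\lambda$) is an assumed example introduced for illustration.

```latex
% Assumed data: Z = [0,\infty),\ R(dz) = e^{-z}\,dz,\ \mathcal{X}_o = \mathcal{Y}_o = \mathbb{R},
% \theta(z) = z,\ \lambda(s) = e^{s} - s - 1 .
\int_Z \lambda(\alpha y z)\,R(dz)
 = \int_0^\infty \big(e^{\alpha y z} - \alpha y z - 1\big)\,e^{-z}\,dz
 = \begin{cases}
     \dfrac{1}{1-\alpha y} - \alpha y - 1, & \alpha y < 1,\\[1ex]
     +\infty, & \alpha y \ge 1 .
   \end{cases}
```

For every $y$ the integral is finite for $\alpha>0$ small enough, so the weaker integrability property holds, but it is infinite for $\alpha=1$ as soon as $y\ge 1$, so the stronger one fails: in this sketch $\theta(z)=z$ is a bad constraint.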

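If the constraint is bad, it may happen that $(P_C)$ is not attained in $L_{\lambda^*_\diamond}R$. This is the 
reason why it is worth introducing its extension $(\overline{P}_C)$ which may admit minimizers and 
is defined by 

$$\text{minimize } \bar{I}(\ell) \text{ subject to } \langle\theta,\ell\rangle\in C,\quad \ell\in L'_{\lambda_\diamond} \qquad (\overline{P}_C)$$

where $L'_{\lambda_\diamond}$ is the topological dual space of $L_{\lambda_\diamond}$; $\bar{I}$ and $\langle\theta,\ell\rangle$ are defined below. 
The dual space $L'_{\lambda_\diamond}$ admits the representation $L'_{\lambda_\diamond}\simeq L_{\lambda^*_\diamond}R\oplus L^s_{\lambda_\diamond}$. This means that any 
$\ell\in L'_{\lambda_\diamond}$ is uniquely decomposed as $\ell = \ell^a+\ell^s$ where $\ell^a\in L_{\lambda^*_\diamond}R$ and $\ell^s\in L^s_{\lambda_\diamond}$ are 
respectively the absolutely continuous part and the singular part of $\ell$, see Theorem 2.3. 
The extension $\bar{I}$ has the following form 

$$\bar{I}(\ell) = I(\ell^a) + \iota^*_{\mathrm{dom}\,I_\gamma}(\ell^s),\qquad \ell\in L'_{\lambda_\diamond} \qquad (2.12)$$
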
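It will be shown that $\bar{I}$ is the greatest convex $\sigma(L'_{\lambda_\diamond},L_{\lambda_\diamond})$-lower semicontinuous extension of 
$I$ to $L'_{\lambda_\diamond}\supset L_{\lambda^*_\diamond}$. In a similar way to (2.9), the assumption $(A^\exists_\theta)$ allows to define $T_o\ell = \langle\theta,\ell\rangle$ 
for all $\ell\in L'_{\lambda_\diamond}$ by 

$$\big\langle y,\langle\theta,\ell\rangle\big\rangle_{\mathcal{Y}_o,\mathcal{X}_o} = \big\langle\langle y,\theta\rangle,\ell\big\rangle_{L_{\lambda_\diamond},L'_{\lambda_\diamond}},\qquad\forall y\in\mathcal{Y}_o.$$
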
Important examples of entropies with $\lambda_\diamond$ not satisfying the $\Delta_2$-condition are the usual 
(Boltzmann) entropy and its variants, see Section 6.1 and (6.1) in particular. 
When $\lambda_\diamond$ satisfies the $\Delta_2$-condition (2.6), $(\overline{P}_C)$ is $(P_C)$. 

2.4. Assumptions. Let us collect the assumptions on $R$, $\gamma^*$ and $\theta$. 
Assumptions (A). 

$(A_R)$ It is assumed that the reference measure $R$ is a $\sigma$-finite nonnegative measure on a 
space $Z$ endowed with some $R$-complete $\sigma$-field. 
$(A_{\gamma^*})$ Assumptions on $\gamma^*$. 

(1) $\gamma^*(\cdot,t)$ is $z$-measurable for all $t$ and for $R$-almost every $z\in Z$, $\gamma^*(z,\cdot)$ is 
a lower semicontinuous strictly convex $[0,+\infty]$-valued function on $\mathbb{R}$ which 
attains its (unique) minimum at $m(z)$ with $\gamma^*(z,m(z))=0$. 

(2) $\int_Z\lambda^*(\alpha m)\,dR + \int_Z\lambda^*(-\alpha m)\,dR<\infty$, for some $\alpha>0$. 
$(A_\theta)$ Assumptions on $\theta$. 

(1) for any $y\in\mathcal{Y}_o$, the function $z\in Z\mapsto\langle y,\theta(z)\rangle\in\mathbb{R}$ is measurable; 

(2) for any $y\in\mathcal{Y}_o$, $\langle y,\theta(\cdot)\rangle = 0$, $R$-a.e. implies that $y=0$; 

($\exists$) $\forall y\in\mathcal{Y}_o$, $\exists\alpha>0$, $\int_Z\lambda(\alpha\langle y,\theta\rangle)\,dR<\infty$. 
Remarks 2.13. Some technical remarks about the assumptions. 

(a) Since $\gamma^*_z$ is a convex function on $\mathbb{R}$, it is continuous on the interior of its domain. 
Under our assumptions, $\gamma^*$ is (jointly) measurable, and so are $\gamma$ and $m$. Hence, $\lambda$ is 
also measurable. 

(b) As $\gamma^*_z$ is strictly convex, $\gamma_z$ is differentiable. 

(c) Assumption $(A^2_{\gamma^*})$ is $m\in L_{\lambda^*_\diamond}$. It allows to consider Problem $(P_C)$ rather than $(P_{C_o})$. 
If this assumption is not satisfied, our results still hold for $(P_{C_o})$, but their statement 
is a little heavier, see Remark 4.10-d below. 

(d) Since $\mathcal{X}_o$ and $\mathcal{Y}_o$ are in separating duality, $(A^2_\theta)$ states that the vector space spanned 
by the range of $\theta$ "is essentially" $\mathcal{X}_o$. This is not an effective restriction. 



3. Preliminary results 
The aim of this section is to recall for the convenience of the reader some results of 

3.1. Convex minimization problems under weak constraint qualifications. The 
main results of [17] are presented. 

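Basic diagram. Let $\mathcal{U}_o$ be a vector space, $\mathcal{L}_o = \mathcal{U}_o^*$ its algebraic dual space, $\Phi$ a $(-\infty,+\infty]$-valued 
convex function on $\mathcal{U}_o$ and $\Phi^*$ its convex conjugate for the duality $\langle\mathcal{U}_o,\mathcal{L}_o\rangle$: 

$$\Phi^*(\ell) := \sup_{u\in\mathcal{U}_o}\{\langle u,\ell\rangle - \Phi(u)\},\qquad \ell\in\mathcal{L}_o.$$

Let $\mathcal{Y}_o$ be another vector space, $\mathcal{X}_o = \mathcal{Y}_o^*$ its algebraic dual space and $T_o:\mathcal{L}_o\to\mathcal{X}_o$ a 
linear operator. We consider the convex minimization problem 

$$\text{minimize } \Phi^*(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in\mathcal{L}_o \qquad (\mathcal{P}_o)$$

where $C$ is a convex subset of $\mathcal{X}_o$. 

This will be used later with $\Phi = I_\lambda$ on the Orlicz space $\mathcal{U}_o = E_{\lambda_\diamond}(Z,R)$ or $\mathcal{U}_o = L_{\lambda_\diamond}(Z,R)$. 
It is useful to define the constraint operator $T_o$ by means of its adjoint $T_o^*:\mathcal{Y}_o\to\mathcal{L}_o^*$ for 
each $\ell\in\mathcal{L}_o$, by $\langle T_o^*y,\ell\rangle_{\mathcal{L}_o^*,\mathcal{L}_o} = \langle y,T_o\ell\rangle_{\mathcal{Y}_o,\mathcal{X}_o}$, $\forall y\in\mathcal{Y}_o$. 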



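Hypotheses. Let us give the list of the main hypotheses. 

$(H_\Phi)$ 1- $\Phi:\mathcal{U}_o\to[0,+\infty]$ is $\sigma(\mathcal{U}_o,\mathcal{L}_o)$-lower semicontinuous, convex and $\Phi(0)=0$ 

2- $\forall u\in\mathcal{U}_o$, $\exists\alpha>0$, $\Phi(\alpha u)<\infty$ 

3- $\forall u\in\mathcal{U}_o$, $u\neq 0$, $\exists t\in\mathbb{R}$, $\Phi(tu)>0$ 

$(H_T)$ 1- $T_o^*(\mathcal{Y}_o)\subset\mathcal{U}_o$ 

2- $\ker T_o^* = \{0\}$ 
$(H_C)$ $C\cap\mathcal{X}$ is a convex $\sigma(\mathcal{X},\mathcal{Y})$-closed subset of $\mathcal{X}$ 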

The definitions of the vector spaces $\mathcal{X}$ and $\mathcal{Y}$ which appear in the last assumption are 
stated below. For the moment, let us only say that if $C$ is convex and $\sigma(\mathcal{X}_o,\mathcal{Y}_o)$-closed, 
then $(H_C)$ holds. 

Several primal and dual problems. These variants are expressed below in terms of new 
spaces and functions. Let us first introduce them. 

- The norms $|\cdot|_\Phi$ and $|\cdot|_\Lambda$. Let $\Phi_\pm(u) = \max(\Phi(u),\Phi(-u))$. By $(H_{\Phi 1})$ and $(H_{\Phi 2})$, 
$\{u\in\mathcal{U}_o;\ \Phi_\pm(u)\le 1\}$ is a convex absorbing balanced set. Hence its gauge functional 
which is defined for all $u\in\mathcal{U}_o$ by $|u|_\Phi := \inf\{\alpha>0;\ \Phi_\pm(u/\alpha)\le 1\}$ is a seminorm. 
Thanks to hypothesis $(H_{\Phi 3})$, it is a norm. 
Taking $(H_{T1})$ into account, one can define 

$$\Lambda_o(y) := \Phi(T_o^*y),\qquad y\in\mathcal{Y}_o. \qquad (3.1)$$

Let $\Lambda_\pm(y) = \max(\Lambda_o(y),\Lambda_o(-y))$. The gauge functional on $\mathcal{Y}_o$ of the set $\{y\in\mathcal{Y}_o;\ \Lambda_\pm(y)\le 1\}$ 
is $|y|_\Lambda := \inf\{\alpha>0;\ \Lambda_\pm(y/\alpha)\le 1\}$, $y\in\mathcal{Y}_o$. Thanks to $(H_\Phi)$ and $(H_T)$, it is a norm 
and 

$$|y|_\Lambda = |T_o^*y|_\Phi,\qquad y\in\mathcal{Y}_o.$$

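- The spaces. Let 

$\mathcal{U}$ be the $|\cdot|_\Phi$-completion of $\mathcal{U}_o$ and let 

$\mathcal{L} := (\mathcal{U}_o,|\cdot|_\Phi)'$ be the topological dual space of $(\mathcal{U}_o,|\cdot|_\Phi)$. 

Of course, we have $(\mathcal{U},|\cdot|_\Phi)'\cong\mathcal{L}\subset\mathcal{L}_o$ where any $\ell$ in $\mathcal{U}'$ is identified with its restriction 
to $\mathcal{U}_o$. Similarly, we introduce 

$\mathcal{Y}$ the $|\cdot|_\Lambda$-completion of $\mathcal{Y}_o$ and 

$\mathcal{X} := (\mathcal{Y}_o,|\cdot|_\Lambda)'$ the topological dual space of $(\mathcal{Y}_o,|\cdot|_\Lambda)$. 

We have $(\mathcal{Y},|\cdot|_\Lambda)'\cong\mathcal{X}\subset\mathcal{X}_o$ where any $x$ in $\mathcal{Y}'$ is identified with its restriction to $\mathcal{Y}_o$. 
We also have to consider the algebraic dual spaces $\mathcal{L}^*$ and $\mathcal{X}^*$ of $\mathcal{L}$ and $\mathcal{X}$. 
- The operators $T$ and $T^*$. Let us denote $T$ the restriction of $T_o$ to $\mathcal{L}\subset\mathcal{L}_o$. One can 
show that under $(H_{\Phi\&T})$, $T_o\mathcal{L}\subset\mathcal{X}$. Hence $T:\mathcal{L}\to\mathcal{X}$. Let us define its adjoint 
$T^*:\mathcal{X}^*\to\mathcal{L}^*$ for all $\omega\in\mathcal{X}^*$ by: $\langle\ell,T^*\omega\rangle_{\mathcal{L},\mathcal{L}^*} = \langle T\ell,\omega\rangle_{\mathcal{X},\mathcal{X}^*}$, $\forall\ell\in\mathcal{L}$. We have the 
inclusions $\mathcal{Y}_o\subset\mathcal{Y}\subset\mathcal{X}^*$. The adjoint operator $T_o^*$ is the restriction of $T^*$ to $\mathcal{Y}_o$. 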
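- The functionals. They are: 

$$\bar\Phi(\zeta) := \sup_{\ell\in\mathcal{L}}\{\langle\zeta,\ell\rangle - \Phi^*(\ell)\},\qquad \zeta\in\mathcal{L}^*$$
$$\Lambda(y) := \bar\Phi(T^*y),\qquad y\in\mathcal{Y}$$
$$\bar\Lambda(\omega) := \bar\Phi(T^*\omega),\qquad \omega\in\mathcal{X}^*$$
$$\Lambda_o^*(x) := \sup_{y\in\mathcal{Y}_o}\{\langle y,x\rangle - \Lambda_o(y)\},\qquad x\in\mathcal{X}_o$$
$$\Lambda^*(x) := \sup_{y\in\mathcal{Y}}\{\langle y,x\rangle - \Lambda(y)\},\qquad x\in\mathcal{X}$$
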


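- The optimization problems. They are: 

$$\text{minimize } \Phi^*(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in\mathcal{L}_o \qquad (\mathcal{P}_o)$$

$$\text{minimize } \Phi^*(\ell) \text{ subject to } T\ell\in C,\quad \ell\in\mathcal{L} \qquad (\mathcal{P})$$

$$\text{maximize } \inf_{x\in C\cap\mathcal{X}}\langle y,x\rangle - \Lambda(y),\quad y\in\mathcal{Y} \qquad (\mathcal{D})$$

$$\text{maximize } \inf_{x\in C\cap\mathcal{X}}\langle x,\omega\rangle - \bar\Lambda(\omega),\quad \omega\in\mathcal{X}^* \qquad (\overline{\mathcal{D}})$$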

Statement of the results. It is assumed that $(H_\Phi)$, $(H_T)$ and $(H_C)$ hold. 
Theorem 3.2 (Primal attainment and dual equality). 

(a) The problems $(\mathcal{P}_o)$ and $(\mathcal{P})$ are equivalent: they have the same solutions and $\inf(\mathcal{P}_o) = 
\inf(\mathcal{P})\in[0,\infty]$. 

(b) We have the dual equalities 

$$\inf(\mathcal{P}_o) = \inf(\mathcal{P}) = \sup(\mathcal{D}) = \sup(\overline{\mathcal{D}}) = \inf_{x\in C}\Lambda_o^*(x) = \inf_{x\in C\cap\mathcal{X}}\Lambda^*(x)\in[0,\infty]$$

(c) If in addition $\{\ell\in\mathcal{L}_o;\ T_o\ell\in C\}\cap\mathrm{dom}\,\Phi^*\neq\emptyset$, then $(\mathcal{P}_o)$ is attained in $\mathcal{L}$. Moreover, 
any minimizing sequence for $(\mathcal{P}_o)$ has $\sigma(\mathcal{L},\mathcal{U})$-cluster points and every such cluster 
point solves $(\mathcal{P}_o)$. 
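
The dual equality of Theorem 3.2-b can be checked numerically on a finite space. The following sketch is an illustration under assumptions made here: $Z$ has three points, $\Phi(u)=\sum_i r_i\lambda(u_i)$ with the assumed function $\lambda(s)=e^s-s-1$, the constraint set is a single point $C=\{x\}$, and all names (`lam_star`, `solve_dual`, the data `r`, `theta`, `x`) are hypothetical.

```python
import math

def lam_star(t):
    """Convex conjugate of lam(s) = exp(s) - s - 1: (1+t)*log(1+t) - t on [-1, inf)."""
    if t < -1.0:
        return math.inf
    if t == -1.0:
        return 1.0
    return (1.0 + t) * math.log(1.0 + t) - t

def primal_value(ell, r):
    """Phi*(ell) = sum_i r_i * lam_star(ell_i / r_i) for a measure ell with density ell_i / r_i."""
    return sum(ri * lam_star(li / ri) for li, ri in zip(ell, r))

def dual_value(y, r, theta, x):
    """y*x - Lambda_o(y) with Lambda_o(y) = sum_i r_i * lam(y * theta_i)."""
    return y * x - sum(ri * (math.exp(y * ti) - y * ti - 1.0) for ri, ti in zip(r, theta))

def solve_dual(r, theta, x, lo=-50.0, hi=50.0, iters=200):
    """Bisection on the increasing map y -> sum_i r_i*theta_i*(exp(y*theta_i) - 1) - x."""
    def slope(y):
        return sum(ri * ti * (math.exp(y * ti) - 1.0) for ri, ti in zip(r, theta)) - x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

r = [0.2, 0.5, 0.3]        # reference measure R on a 3-point space (assumed data)
theta = [0.0, 1.0, 2.0]    # constraint function theta (assumed data)
x = 0.4                    # constraint: sum_i theta_i * ell_i = x

y_hat = solve_dual(r, theta, x)
ell_hat = [ri * (math.exp(y_hat * ti) - 1.0) for ri, ti in zip(r, theta)]
print(primal_value(ell_hat, r), dual_value(y_hat, r, theta, x))
```

The two printed numbers agree up to rounding, which is the finite-dimensional content of $\inf(\mathcal{P}_o)=\sup(\mathcal{D})$ in this toy setting; the primal minimizer is recovered from the dual one through the derivative of $\lambda$.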

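Theorem 3.3 (Dual attainment and representation. Interior convex constraint). 
Assume that $C\cap\mathrm{icor}\,(T_o\,\mathrm{dom}\,\Phi^*)\neq\emptyset$. 

Then, the primal problem $(\mathcal{P}_o)$ is attained in $\mathcal{L}$ and the extended dual problem $(\overline{\mathcal{D}})$ is 
attained in $\mathcal{X}^*$. Any solution $\hat\ell\in\mathcal{L}$ of $(\mathcal{P}_o)$ is characterized by the existence of some 
$\bar\omega\in\mathcal{X}^*$ such that 

(a) $T\hat\ell\in C$ 

(b) $\langle T^*\bar\omega,\hat\ell\rangle\le\langle T^*\bar\omega,\ell\rangle$ for all $\ell\in\{\ell\in\mathcal{L};\ T\ell\in C\}\cap\mathrm{dom}\,\Phi^*$ (3.4) 
(c) $\hat\ell\in\partial_{\mathcal{L}}\bar\Phi(T^*\bar\omega)$ 

Moreover, $\hat\ell\in\mathcal{L}$ and $\bar\omega\in\mathcal{X}^*$ satisfy (3.4) if and only if $\hat\ell$ solves $(\mathcal{P}_o)$ and $\bar\omega$ solves $(\overline{\mathcal{D}})$. 

The assumption $C\cap\mathrm{icor}\,(T_o\,\mathrm{dom}\,\Phi^*)\neq\emptyset$ is equivalent to $C\cap\mathrm{icordom}\,\Lambda^*\neq\emptyset$ and the 
representation formula (3.4)-c is equivalent to Young's identity, with $\hat x := T\hat\ell$, 

$$\Phi^*(\hat\ell) + \bar\Phi(T^*\bar\omega) = \langle\bar\omega,T\hat\ell\rangle = \Lambda^*(\hat x) + \bar\Lambda(\bar\omega). \qquad (3.5)$$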
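
A scalar analogue of Young's identity (3.5) may make the representation formula more concrete. The pair of conjugate functions below is an assumed example chosen for illustration, not an object defined in this section.

```latex
% Assumed conjugate pair on \mathbb{R}:
\lambda(s) = e^{s} - s - 1,\qquad \lambda^*(t) = (1+t)\log(1+t) - t\quad (t > -1).
% Young's inequality always holds:
\lambda(s) + \lambda^*(t) \ \ge\ st ,
% and equality (Young's identity) holds exactly when t is the derivative of \lambda at s:
t = \lambda'(s) = e^{s} - 1
\ \Longrightarrow\
\lambda(s) + \lambda^*(t) = (e^{s} - s - 1) + (s e^{s} - e^{s} + 1) = s\,(e^{s} - 1) = st .
```

In the same way, (3.4)-c says that the primal solution is obtained as a subgradient of the dual functional evaluated at the dual solution.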

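Formula (3.4)-c can be made a little more precise by means of the following regularity 
result. 

Theorem 3.6. Any solution $\bar\omega$ of $(\overline{\mathcal{D}})$ shares the following properties 

(a) $\bar\omega$ is in the $\sigma(\mathcal{X}^*,\mathcal{X})$-closure of $\mathrm{dom}\,\Lambda$; 

(b) $T^*\bar\omega$ is in the $\sigma(\mathcal{L}^*,\mathcal{L})$-closure of $T^*(\mathrm{dom}\,\Lambda)$. 

If in addition the level sets of $\Phi$ are $|\cdot|_\Phi$-bounded, then 

(a') $\bar\omega$ is in $\mathcal{Y}''$. More precisely, it is in the $\sigma(\mathcal{Y}'',\mathcal{X})$-closure of $\mathrm{dom}\,\Lambda$; 

(b') $T^*\bar\omega$ is in $\mathcal{U}''$. More precisely, it is in the $\sigma(\mathcal{U}'',\mathcal{L})$-closure of $T^*(\mathrm{dom}\,\Lambda)$ 

where $\mathcal{Y}''$ and $\mathcal{U}''$ are the topological bidual spaces of $\mathcal{Y}$ and $\mathcal{U}$. This occurs if $\Phi$, or 
equivalently $\Phi^*$, is an even function. 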

3.2. Convex conjugates in a Riesz space. The following results are taken from [HI 
[T6] . For the basic definitions and properties of Riesz spaces, see [TJ Chapter 2]. 
Let $U$ be a Riesz vector space for the order relation $\le$. Since $U$ is a Riesz space, any 
$u\in U$ admits a nonnegative part: $u_+ := u\vee 0$, and a nonpositive part: $u_- := (-u)\vee 0$. 
Of course, $u = u_+ - u_-$ and as usual, we state: $|u| = u_+ + u_-$. 

Remark 3.7. Recall that there is a natural order on the algebraic dual space $E^*$ of a Riesz 
vector space $E$ which is defined by: $e^*\le f^*\iff\langle e^*,e\rangle\le\langle f^*,e\rangle$ for any $e\in E$ with 
$e\ge 0$. A linear form $e^*\in E^*$ is said to be relatively bounded if for any $f\in E$, $f\ge 0$, 
we have $\sup_{e:|e|\le f}|\langle e^*,e\rangle|<+\infty$. Although $E^*$ may not be a Riesz space in general, the 
vector space $E^b$ of all the relatively bounded linear forms on $E$ is always a Riesz space. 
In particular, the elements of $E^b$ admit a decomposition in positive and negative parts. 

Let $\Phi$ be a $[0,\infty]$-valued function on $U$ which satisfies the following conditions: 

$$\forall u\in U,\quad \Phi(u) = \Phi(u_+ - u_-) = \Phi(u_+) + \Phi(-u_-) \qquad (3.8)$$

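$$\forall u,v\in U,\quad \begin{cases} 0\le u\le v \implies \Phi(u)\le\Phi(v) \\ u\le v\le 0 \implies \Phi(u)\ge\Phi(v) \end{cases} \qquad (3.9)$$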

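Clearly (3.8) implies $\Phi(0)=0$; (3.8) and (3.9) imply that for any $u\in U$, $\Phi(u) = \Phi(u_+) + \Phi(-u_-) 
\ge\Phi(0)+\Phi(0) = 0$. Therefore, $\Phi^*$ is $[0,\infty]$-valued and $\Phi^*(0)=0$. 
For all $u\in U$, $\Phi_+(u) = \Phi(|u|)$, $\Phi_-(u) = \Phi(-|u|)$. The convex conjugates of $\Phi_+$ and 
$\Phi_-$ with respect to $\langle U,U^*\rangle$ are denoted $\Phi^*_+$ and $\Phi^*_-$. Let $L$ be the vector space spanned 
by $\mathrm{dom}\,\Phi^*$. The convex conjugates of $\Phi^*$, $\Phi^*_+$ and $\Phi^*_-$ with respect to $\langle L,L^*\rangle$ are denoted 
$\bar\Phi$, $\bar\Phi_+$ and $\bar\Phi_-$. The spaces of relatively bounded linear forms on $U$ and $L$ are denoted by $U^b$ 
and $L^b$, whenever $L$ is a Riesz space. 
One writes $a_\pm\in A_\pm$ for [$a_+\in A_+$ and $a_-\in A_-$]. 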

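Proposition 3.10. Assume (3.8) and (3.9) and suppose that $L$ is a Riesz space. 

(a) For all $\ell\in U^*$, 

$$\Phi^*(\ell) = \begin{cases} \Phi^*_+(\ell_+) + \Phi^*_-(\ell_-) & \text{if } \ell\in U^b \\ +\infty & \text{otherwise} \end{cases}$$

(b) Denoting $L_+$ and $L_-$ the vector subspaces of $L$ spanned by $\mathrm{dom}\,\Phi^*_+$ and $\mathrm{dom}\,\Phi^*_-$, we 
have 

$$\bar\Phi(\zeta) = \begin{cases} \bar\Phi_+(\zeta_{+|L_+}) + \bar\Phi_-(\zeta_{-|L_-}) & \text{if } \zeta\in L^b \\ +\infty & \text{otherwise} \end{cases}$$

which means that $\bar\Phi_\pm(\zeta_\pm) = \bar\Phi_\pm(\zeta'_\pm)$ if $\zeta_\pm$ and $\zeta'_\pm$ match on $L_\pm$. 

(c) Let $\ell\in L$, $\zeta\in L^*$ be such that $\ell\in\partial_L\bar\Phi(\zeta)$. Then, $\ell_\pm\in\partial_{L_\pm}\bar\Phi_\pm(\zeta_{\pm|L_\pm})\subset L_\pm$. 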

Proof, (a) and (b) are proved at [Mlj Proposition 4.4] under the additional assumption 

that for all M G t/ there exists A > such that $(Am) < +oo. But it can be removed. 

Indeed, if for instance is null, $1 is the convex indicator of {0} whose domain is in 

U'^. The statement about $ is an iteration of this argument. 

The last statement of (b) about C±|l± directly follows from dom$^ C L±. 

For (c), see the proof of P^Bl Proposition 4.5]. □ 

4. Solving $(\overline{P}_C)$ 

The general assumptions (A) are imposed and we study $(\overline{P}_C)$. 

4.1. Several function spaces and cones. To state the extended dual problem $(\overline{D}_C)$ 
below, notation is needed. If $\lambda$ is not an even function, one has to consider 

$$\begin{cases} \lambda_+(z,s) = \lambda(z,|s|) \\ \lambda_-(z,s) = \lambda(z,-|s|) \end{cases} \qquad (4.1)$$

which are Young functions and the corresponding Orlicz spaces. 
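
To see why the two functions in (4.1) can behave very differently, consider the following computation. It is a sketch for the assumed non-even function $\lambda(s)=e^{s}-s-1$, which is an illustrative choice rather than a definition from this section.

```latex
% Assumed non-even function: \lambda(s) = e^{s} - s - 1 .
\lambda_+(s) = \lambda(|s|) = e^{|s|} - |s| - 1,
\qquad
\lambda_-(s) = \lambda(-|s|) = e^{-|s|} + |s| - 1 .
% Growth at infinity:
\frac{\lambda_+(2s)}{\lambda_+(s)} \xrightarrow[s\to\infty]{} +\infty,
\qquad
\frac{\lambda_-(2s)}{\lambda_-(s)} \xrightarrow[s\to\infty]{} 2 .
```

So $\lambda_-$ grows linearly and satisfies a doubling bound of the type $\rho(2s)\le C\rho(s)$ for large $s$ with any $C>2$, while $\lambda_+$ grows exponentially and satisfies no such bound; the two associated Orlicz spaces therefore have to be handled separately.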

Definitions 4.2. For any relatively bounded linear form ( on L'^^ i.e. C & L'^ , one writes: 

• CeK'l to specify that C±\l' ni' ^ ^a± 

I A_|_ Ao 

• C e ^A* specify that C±|i r^l', ^ L' 

I A_j_ Ao ± 

• C e JsTa to specify that (±1^ ^hl; ^ ^a± 

I A_j_ Ao 

• C^K{, to specify that C±\l,, RnL\ ^ L 



C G Kf^ to specify that C±\li nL{ ^ ^a± 
where X± are defined at ^4jJ^ o.i^'d C±|l , nL' ^ nT'^o.ns that the restriction of (± to 
L± n is continuous with respect to relative topology generated by the strong topology 
of L± on L± n L\^. 

(1) The sets K'^, K[,, K\, K^, and are defined to he the corresponding subsets of 
L'^^. They are not vector spaces in general hut convex cones with vertex 0. 

(2) The a {K'l ^ K'y) -closure A of a set A is defined as follows: ( G L'^^ is in A if 
C±|L^ nL'^^ in the a{L'l^n L'^^, L'^^H L'^ J -closure of A± = {C±;C ^ A}. Clearly, 

A^ = {C±;CeA}. 

One defines similarly the a{K';^,, Kx*), a{Kx,K'y), a^K^,, Kx*) and a{Kl' , K^)- 
closures. 

(3) Let A he a suhset of Lx^- Its strong closure s-c\A in Kx is the set of all measurahle 
functions u such that u± is in the || ■ \\x±-dosure of A± = {v±;v G A}. 

Let p be a Young function. By Theorem 12. 3[ we have L" = [Lp.R © L^] © L^'. For any 
( E Lp = {Lp*R © Lp)', let us denote the restrictions Ci = C\Lp,R and ^2 = Since, 
{Lp*R)' Lp® Lp,, one sees that any ( G is uniquely decomposed into 

C = Ci + Ci + (2 (4.3) 

with Ci = Ci + CI e L'p,, G Lp, CI G L'p, and C2 G ^p- With our definitions, K'^ = 
[Kx © A''^.] © K^' and the decomposition (14.31) holds for any C ^ K'^ with 

C, = C? + CteKx®Ki,=K',., 
C2 e Kf. 

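4.2. The ingredients of the saddle-point method. One applies the abstract results 
of Section 3.1 with 

$$\Phi(u) = I_\lambda(u) := \int_Z\lambda(u)\,dR,\qquad u\in\mathcal{U}_o := L_{\lambda_\diamond}. \qquad (4.4)$$

This gives $\mathcal{U} = L_{\lambda_\diamond}$ with the Orlicz norm $|u|_\Phi = \|u\|_{\lambda_\diamond}$ and $\mathcal{L} = L'_{\lambda_\diamond} = L_{\lambda^*_\diamond}R\oplus L^s_{\lambda_\diamond}$, 
by Theorem 2.3. The space $\mathcal{Y}$ is the completion of $\mathcal{Y}_o$ endowed with the norm $|y|_\Lambda = 
\|\langle y,\theta\rangle\|_{\lambda_\diamond}$. One denotes $\mathcal{Y} = \mathcal{Y}_L$. It is isomorphic to the closure of the subspace $\{\langle y,\theta\rangle;\ y\in 
\mathcal{Y}_o\}$ in $L_{\lambda_\diamond}$, see assumption $(A^2_\theta)$. With some abuse of notation, one still denotes $T^*y = 
\langle y,\theta\rangle$ for $y\in\mathcal{Y}_L$. Remark that this can be interpreted as a dual bracket between $\mathcal{X}_o^*$ and 
$\mathcal{X}_o$ since $T^*y = \langle\tilde y,\theta\rangle$ $R$-a.e. for some $\tilde y\in\mathcal{X}_o^*$. The topological dual space $\mathcal{X}_L = \mathcal{Y}_L'$ is 
identified with $L'_{\lambda_\diamond}/\ker T$ and its norm is given by $|x|^*_\Lambda = \inf\{\|\ell\|^*_{\lambda_\diamond};\ \ell\in L'_{\lambda_\diamond}: T(\ell) = x\}$. 

This last identity is a dual equality as in Theorem 3.2-b with $\Phi = \iota_B$ where $B$ is the unit 
ball of $L_{\lambda_\diamond}$ and $C = \{x\}$. 

The assumption $(H_C)$ that $C$ is $\sigma(\mathcal{X}_L,\mathcal{Y}_L)$-closed convex is equivalent to 

$$T^{-1}C\cap L'_{\lambda_\diamond} = \bigcap_{y\in Y}\big\{\ell\in L'_{\lambda_\diamond};\ \langle\langle y,\theta\rangle,\ell\rangle\ge a_y\big\} \qquad (4.5)$$

for some subset $Y\subset\mathcal{Y}_L$ and some function $y\in Y\mapsto a_y\in\mathbb{R}$. For comparison, note that 
if $C$ is only supposed to be convex, $\bigcap_{(y,a)\in A}\{\ell\in L'_{\lambda_\diamond};\ \langle\langle y,\theta\rangle,\ell\rangle>a\}$ with $A\subset\mathcal{Y}_L\times\mathbb{R}$ is 
the general shape of $T^{-1}C$. 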

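4.3. The main result. Let us define 

$$\Gamma^*(x) = \sup_{y\in\mathcal{Y}_o}\big\{\langle y,x\rangle - I_\gamma(\langle y,\theta\rangle)\big\},\qquad x\in\mathcal{X}_o$$

which is the convex conjugate of $\Gamma(y) = I_\gamma(\langle y,\theta\rangle)$, $y\in\mathcal{Y}_o$. The dual problem $(\mathcal{D})$ associated 
with $(P_C)$ and $(\overline{P}_C)$ is 

$$\text{maximize } \inf_{x\in C\cap\mathcal{X}_L}\langle y,x\rangle - I_\gamma(\langle y,\theta\rangle),\quad y\in\mathcal{Y}_L \qquad (D_C)$$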

The extended dual problem is 

maximize inf x) - /a([T*c^K) + ^S^^,^. ([T^O + ^/^([^Ms), u; ey (Dc) 

3? t O 

where 

• T* : XI — > L'^^ is the extension of T* which is defined at Section 13.1^ 

• D is the cr(i^r^', i^'^)-closure of douiI\ and 

• 3^ is the cone of all uj & XI such that T*u G K'l. 

Clearly, idom/;,.(Ci) = 4om/;,^ (Ci+) + ^dom/;,._ (Ci'-) and ld{C2) = tD+{C2+) + tD_{C2-) where 
D± is the (T{Lf^ n L'l^.Ll^ n L';^J-closure of dom/A±. 

As $R$ is assumed to be $\sigma$-finite, there exists a measurable partition $(Z_k)_{k\ge 1}$ of $Z$: $\bigsqcup_k Z_k = 
Z$, such that $R(Z_k)<\infty$ for each $k\ge 1$. 

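Theorem 4.6. Suppose that 

(1) the assumptions (A) are satisfied; 

(2) for each $k\ge 1$, $L_{\lambda_\diamond}(Z_k,R_{|Z_k})$ is dense in $L_{\lambda_+}(Z_k,R_{|Z_k})$ and $L_{\lambda_-}(Z_k,R_{|Z_k})$ with 
respect to the topologies associated with $\|\cdot\|_{\lambda_+}$ and $\|\cdot\|_{\lambda_-}$; 

(3) $C$ satisfies (4.5) with $\langle y,\theta\rangle\in L_{\lambda_\diamond}$ for all $y\in Y$. 

Then: 

(a) The dual equality for $(\overline{P}_C)$ is 

$$\inf(\overline{P}_C) = \inf_{x\in C}\Gamma^*(x) = \sup(D_C) = \sup(\overline{D}_C)\in[0,\infty].$$

(b) If $C\cap\mathrm{dom}\,\Gamma^*\neq\emptyset$ or equivalently $C\cap T_o\,\mathrm{dom}\,\bar{I}\neq\emptyset$, then $(\overline{P}_C)$ admits solutions in 
$L'_{\lambda_\diamond}$, any minimizing sequence admits $\sigma(L'_{\lambda_\diamond},L_{\lambda_\diamond})$-cluster points and every such point is 
a solution to $(\overline{P}_C)$. 
Suppose that in addition we have 

$$C\cap\mathrm{icordom}\,\Gamma^*\neq\emptyset \qquad (4.7)$$
or equivalently $C\cap\mathrm{icor}\,(T_o\,\mathrm{dom}\,\bar{I})\neq\emptyset$. Then: 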

(c) Let us denote x = T*L There exists Q such that 

(a) xeCndomF* 

\h) {u),x)xixl < {^^,x)xi,Xl,^x G CndomF* (4.8) 
(c) iey,i[T*uri)R + DH[T*i^h) 
where 

D^{r]) = {k e Ll;\fh e Lx^,r] + h e D ^ {h,k) < 0} 
is the outer normal cone of D at r]. 

T*uj is in the cr{K", K'^)-closure o/T*(domA) and there exists some uj G X* such that 

[T*u:r, = {Co,9{-))x;,x^ 
is a measurable function in the strong closure c»/T*(domA) in Kx- 

Furthermore, I G and a) G 3^ satisfy \J^-S^ if and only if i solves (Pc) ond uj solves 

(d) Of course, Q^c) implies x = J^9'y'{{il!,9)) dR+ {9,i'^). Moreover, 




1. X minimizes T* on C, 

2. I{i) = r*{x) = J^Y dR + s^p{{u,t)]u edom I^} <oo and 

3. + j^Y{Q,e))dR = j^{Co,d)dt+{[T*uj]^,t)Ki^,^Ki. 

Proposition 4.9. For the assumption (2) of Theorem \4.6\ to be satisfied, it is enough that 
one of these conditions holds 

(i) A is even or more generally < liminf(_,oo ^if) ^ l^sup^^^ X^(^) < +oo; 

(ii) limt^oo x^(^) = +00 and A_ satisfies the condition l{2.6\) . 

Proof. It is enough to work with a bounded measure R. 

Condition (i) is equivalent to Lx^ = Lx = Lx^ and the result follows immediately. 
Condition (ii) says that A+ = A^ and Lx_ = Ex_ ■ As 7* is assumed to be strictly convex, 
zero is in the interior of domA and Lx^ contains the space B of all bounded measurable 
functions. But B is dense in Ex_ and the result follows. □ 

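Remarks 4.10. General remarks about Theorem 4.6. 

(a) The assumption (3) is equivalent to: $C$ is $\sigma(\mathcal{X}_L,\mathcal{Y}_L)$-closed convex. 

(b) The dual equality with $C=\{x\}$ gives for all $x\in\mathcal{X}_o$ 

$$\Gamma^*(x) = \inf\{\bar{I}(\ell);\ \ell\in L'_{\lambda_\diamond},\ \langle\theta,\ell\rangle = x\}.$$

(c) Note that $\bar\omega$ does not necessarily belong to $\mathcal{Y}_o$. Therefore, the Young equality $\langle\bar\omega,\hat x\rangle = 
\Gamma^*(\hat x)+\Gamma(\bar\omega)$ is meaningless. Nevertheless, there exists a natural extension $\bar\Gamma$ of $\Gamma$ 
such that $\langle\hat x,\bar\omega\rangle = \Gamma^*(\hat x)+\bar\Gamma(\bar\omega)$ holds, see (3.5). This gives the statement (d-3). 

(d) Removing the assumption $(A^2_{\gamma^*})$: $m\in L_{\lambda^*_\diamond}$, one can still consider the minimization 
problem 

$$\text{minimize } \bar{I}(\ell) \text{ subject to } \langle\theta,\ell-mR\rangle\in C_o,\quad \ell\in mR+L'_{\lambda_\diamond} \qquad (\overline{P}_{C_o})$$

instead of $(\overline{P}_C)$. The transcription of Theorem 4.6 is as follows. Denote 

$$\Lambda^*(x) = \sup_{y\in\mathcal{Y}_o}\Big\{\langle y,x\rangle - \int_Z\lambda(\langle y,\theta\rangle)\,dR\Big\},\qquad x\in\mathcal{X}_o$$

and replace respectively $(\overline{P}_C)$, $C$, $\Gamma^*$, $\hat x$ and $\gamma$ by $(\overline{P}_{C_o})$, $C_o$, $\Lambda^*$, $\tilde x$ and $\lambda$ where 
$\tilde x = \langle\theta,\hat\ell-mR\rangle$ is well-defined. 

The statement (b) must be replaced by the following one: If $C_o\cap\mathrm{dom}\,\Lambda^*\neq\emptyset$, 
then $(\overline{P}_{C_o})$ admits solutions in $mR+L'_{\lambda_\diamond}$, any minimizing sequence $(\ell_n)_{n\ge 1}$ is such 
that $(\ell_n-mR)_{n\ge 1}$ admits cluster points $\hat\ell-mR$ in $L'_{\lambda_\diamond}$ with respect to the topology 
$\sigma(L'_{\lambda_\diamond},L_{\lambda_\diamond})$ and $\hat\ell$ is a solution of $(\overline{P}_{C_o})$. 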

Proof of Theorem 4.6. It is an application of Theorems 3.2 and 3.3. We use the notation 
and framework of Section 3.1. 

With (113D and Theorem [SJl-a, dom<^* C C = L'^^. For all £ G L'^^, 

$*(£) $;(£+) + $!(£_) 

i inf{/,;(r) + il^^x^in. keL',^:k> 0, fc,^,^ = 

+ inf{/,,(A;'^) + i:,^xAn,k G L',_ : k > 0, k\L,^ = i_} 

Equality (a) comes from Proposition 3.10-a and equality (b) is a dual equality of the type
of Theorem 3.2-b applied with

/;(A;) = V(n + ^domp(fc^) keL'^ (4.11) 
which holds for any Young function $\rho$. This identity is proved by Fougères, Giner, Kozek
and Rockafellar [HI [T31 [21] under the assumptions (A/j) and (A^.). The function I* is 
strongly continuous on icordom/* C L^, see [HI Lemma 2.1]. Hence, under the assump- 
tion (2), we obtain that 

/(£) = mR) , £ G L';,^ (4. 12) 

taking advantage of the direct sum £ = ®k^\z,, acting on m = {u\z^.)k>i which lead to the 
nonnegative series <!>(«) = ®k^{u\Zy) and $*(^) = Y.k^*^^\Zk)- 

• Reduction to $m = 0$. We have seen at (2.10) that the transformation $Q \rightsquigarrow \ell = Q - mR$
corresponds to the transformations $\gamma \rightsquigarrow \lambda$ and $(P_C) \rightsquigarrow$ (2.10). This still works with $(\overline{P}_C)$
and one can assume from now on without loss of generality that $m = 0$ and $\gamma = \lambda$.

The assumption (A^,) will not be used during the rest of the proof. This allows Remark
4.10-d.

• Verification of (H^) and (Ht). Suppose that W = {z E Z;\{z,s) = 0,Ws G M} is 
such that R{W) > 0. Then, any £ such that {ulw,£) > for some u G Lx^ satisfies 

= +00. Therefore, one can remove W from Z without loss of generality. Once, 
this is done, the hypothesis (-ff$) is satisfied under the assumption (A;^*). The hypothesis 
(Hti) is (A^) while {Ht2) is (A^). 

• The computation of $ in the case where A is even. Since $ is even. Theorem 13.61 tells 
us that dom<l> is included in the ct(L^, L';(^)-closure of dom$. Thanks to (14. lip and the 
decomposition (14. 3p . the extension $ is given for each ( G L'^ by 

^(C) = (/aO*(Ci,C2) 

sup {(Cl, fR) + (C2, k) - h4fR) - ^loml.m 

= /^(Cl) + Cm/.(C2) 
= /a(Ci)+'d(C2) 

where D is the (t(L^', L^)-closure of domix and we dropped the restrictions (\l for sim- 
plicity. 

• Extension to the case where A is not even. By Proposition 13.101 -b. we have $(C) = 
*^+(C+iL' nL' ) + *^-(C-|L' nL' ) if C ^ -^A +00 otherwise. It follows that 

I A_|_ Ao I A_ Ao ^ 

HO = m) = hio + Cm/,,(cr) + ^d(C2) (4.13) 

if C € K'x and +00 otherwise. In particular, we have 

A(y) = lxi{y,0)), yey 

1 +00 otherwise ' ^' 



This provides us with the dual problems $(D_C)$ and $(\overline{D}_C)$.



Proof of (a) and (b). Apply Theorem 3.2. □
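Let us go on with the proof of (c). By Theorem 3.3, $(\overline{P}_C, \overline{D}_C)$ admits a solution in
$L'_{\lambda_\diamond} \times \overline{\mathcal{Y}}$ and $(\hat\ell, \bar\omega) \in L'_{\lambda_\diamond} \times \overline{\mathcal{Y}}$ solves $(\overline{P}_C, \overline{D}_C)$ if and only if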



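(a) $\hat x \in C \cap \mathrm{dom}\,\Gamma^*$

(b) $\langle\bar\omega, \hat x\rangle \le \langle\bar\omega, x\rangle, \ \forall x \in C \cap \mathrm{dom}\,\Gamma^*$ (4.14)

(c) $\hat\ell \in \partial_{L'}\bar\Phi(T^*\bar\omega)$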
where $\hat x = T\hat\ell$ is defined in the weak sense with respect to the duality $\langle\mathcal{Y}_L, \mathcal{X}_L\rangle$. Since
$\mathrm{dom}\,\Gamma^* \subset \mathcal{X}_L$, the above dual brackets are meaningful.

• The computation of Ol' $(C)- Let us first assume that A is even. For all u G Lx, = 

U2 = u and ul = 0. This gives + -$(C) = hiC^ + ui) - hiCi) + iD{C2 + U2) - iDiC2) 
where ui = u and U2 = u act respectively on Lx*R and L\. This direct sum structure 
leads us to 

dL'fi{0 = dL,,Rh{0 + dLMC2)- (4.15) 

which again is the direct sum of the absolutely continuous and singular components of 
di'^^iO- Differentiating in the directions oiU = La, one obtains di^^RlxiCi) = {^'iCi)R}- 
The computation of 9l^6d(C2) is standard: dLiiuiCi) = -D^(C2) is the outer normal cone 
of D at C2- 

Now, consider a general A. By Proposition IS.lOl a. G di' nL' ^+i\T*id]^) and G 

A_|_ Ao 

<9l'^ nL'^^^-i[T*iu]-). Therefore, (14.151) becomes 

• Representation of $[T^*\bar\omega]^a$. One still has to prove that

$$[T^*\bar\omega]^a(z) = \langle\theta(z), \tilde\omega\rangle \qquad (4.16)$$
for $R$-a.e. $z \in Z$ and some linear form $\tilde\omega$ on $\mathcal{X}_o$.

If W- := {z G Z] X{z, s) = 0, Vs < 0} satisfies R{W-) > 0, domJ is a set of linear forms 
which are nonnegative on W and 7^(s) = for all s < 0, 2; G W. Hence, one can take any 
function for the restriction to of [T*a;]^_ without modifying (I4.14p -c. As a symmetric 
remark holds for = {z & Z; X{z, s) = 0, Vs > 0}, it remains to consider the situation 
where for i?-a.e. z, there are S-{z) < < 5^(2;) such that X{z,s±{z)) > 0. This implies 
that lims^ioo A(z, s)/s > 0. 

By Theorem 13.31 T*u! is in the cr{K'l, K'^)-closuTe of T*(domA). Therefore, [T*ci}]^ is in 
the a{Kx, K'^^)-closme of T*(domA). As T*(domA) is convex, this closure is its strong 
closure in Kx- Since there exists a finite measurable function c{z) such that < c{z) < 
lim^^oo A(2;, s)/s, one can consider the nontrivial Young function p{z,s) = c{z)\s\ and 
the corresponding Orlicz spaces Lp and L'^ = Lp*. If is a bounded measure, we have 
Lx^ C Lp and Lp* C Lx*, so that [T*c<}]^ is in the strong closure of T*(domA) in Lp. 
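As a consequence, $[T^*\bar\omega]^a$ is the pointwise limit of a sequence $(T^*y_n)_{n\ge1}$ with $y_n \in \mathcal{Y}_o$. As
$T^*y_n(z) = \langle y_n, \theta(z)\rangle$, we see that $[T^*\bar\omega]^a(z) = \langle\theta(z), \tilde\omega\rangle$ for some linear form $\tilde\omega$ on $\mathcal{X}_o$. If $R$
is unbounded, it is still assumed to be $\sigma$-finite: there exists a sequence $(Z_k)$ of measurable
subsets of $Z$ such that $\bigcup_k Z_k = Z$ and $R(Z_k) < \infty$ for each $k$. Hence, for each $k$ and all
$z \in Z_k$, $[T^*\bar\omega]^a(z) = \langle\theta(z), \tilde\omega^k\rangle$ for some linear form $\tilde\omega^k$ on $\mathcal{X}_o$, from which (4.16) follows.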
• Proof of (c). It follows from the previous considerations and Theorem 3.3.
• Proof of (d). Statement (d)-1 follows from Theorem 3.2. Statement (d)-2 is immediately
deduced from (c). Finally, (d)-3 is (3.5). □



5. Solving (Pc) 

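The general assumptions (A) are imposed and we study $(P_C)$ under the additional good
constraint assumption (Ag) which imposes that the convex set $C$ is such that

$$T_o^{-1}C \cap L_{\lambda_\diamond^*}R = \bigcap_{y \in Y} \left\{ fR \in L_{\lambda_\diamond^*}R;\ \int_Z \langle y,\theta\rangle f\,dR \ge a_y \right\} \qquad (5.1)$$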




for some subset $Y \subset \mathcal{X}_o^*$ such that $\langle y,\theta\rangle \in L_{\lambda_\diamond}$ for all $y \in Y$ and some function $y \in Y \mapsto a_y \in \mathbb{R}$.

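The dual problem $(\mathcal{D})$ associated with $(P_C)$ is $(D_C)$ and the extended dual problem is

$$\text{maximize } \inf_{x \in C}\langle\omega, x\rangle - I_\gamma(\langle\omega,\theta\rangle), \quad \omega \in \widetilde{\mathcal{Y}} \qquad (\widetilde{D}_C)$$

where $\widetilde{\mathcal{Y}}$ is the convex cone of all linear forms $\omega$ on $\mathcal{X}_o$ which are such that

- the function $\langle\omega,\theta(\cdot)\rangle_{\mathcal{X}_o^*,\mathcal{X}_o}$ is measurable;

- $\int_Z \lambda(t\langle\omega,\theta(\cdot)\rangle)\,dR < \infty$ for some $t > 0$;

- $\langle\omega,\theta(\cdot)\rangle$ is in the $\sigma(K_\lambda, K_{\lambda^*})$-closure of $\{\langle y,\theta\rangle;\ y \in \mathcal{Y}_o\}$.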

Theorem 5.2. Suppose that 



(1) the assumptions (A) and (A^q) are satisfied; 

(2) for $R$-almost every $z \in Z$, $\lim_{t\to\pm\infty} \gamma_z^*(t)/t = +\infty$;

(3) $C$ satisfies (5.1) with $\langle y,\theta\rangle \in E_{\lambda_\diamond}$ for all $y \in Y$.

Then: 

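(a) The dual equality for $(P_C)$ is

$$\inf(P_C) = \sup(D_C) = \sup(\widetilde{D}_C) = \inf_{x \in C}\Gamma^*(x) \in [0, \infty].$$

(b) If $C \cap \mathrm{dom}\,\Gamma^* \neq \emptyset$ or equivalently $C \cap T_o\,\mathrm{dom}\,I \neq \emptyset$, then $(P_C)$ admits a unique solution
$\hat Q$ in $L_{\lambda_\diamond^*}R$ and any minimizing sequence $(Q_n)_{n\ge1}$ converges to $\hat Q$ with respect to the
topology $\sigma(L_{\lambda_\diamond^*}R, E_{\lambda_\diamond})$.

Suppose that in addition $C \cap \mathrm{icordom}\,\Gamma^* \neq \emptyset$ or equivalently $C \cap \mathrm{icor}(T_o\,\mathrm{dom}\,I) \neq \emptyset$.

(c) Let us define $\hat x := \int_Z \theta\,d\hat Q$ in the weak sense with respect to the duality $\langle\mathcal{Y}_o, \mathcal{X}_o\rangle$. There
exists $\tilde\omega \in \widetilde{\mathcal{Y}}$ such that

(a) $\hat x \in C \cap \mathrm{dom}\,\Gamma^*$

(b) $\langle\tilde\omega, \hat x\rangle_{\mathcal{X}_o^*,\mathcal{X}_o} \le \langle\tilde\omega, x\rangle_{\mathcal{X}_o^*,\mathcal{X}_o}, \ \forall x \in C \cap \mathrm{dom}\,\Gamma^*$ (5.3)

(c) $\hat Q(dz) = \gamma_z'(\langle\tilde\omega,\theta(z)\rangle)\,R(dz)$.

Furthermore, $\hat Q \in L_{\lambda_\diamond^*}R$ and $\tilde\omega \in \widetilde{\mathcal{Y}}$ satisfy (5.3) if and only if $\hat Q$ solves $(P_C)$ and $\tilde\omega$
solves $(\widetilde{D}_C)$.

(d) Of course, (5.3-c) implies $\hat x = \int_Z \theta\,\gamma'(\langle\tilde\omega,\theta\rangle)\,dR$ in the weak sense. Moreover,

1. $\hat x$ minimizes $\Gamma^*$ on $C$,

2. $I(\hat Q) = \Gamma^*(\hat x) = \int_Z \gamma^* \circ \gamma'(\langle\tilde\omega,\theta\rangle)\,dR < \infty$ and

3. $I(\hat Q) + \int_Z \gamma(\langle\tilde\omega,\theta\rangle)\,dR = \int_Z \langle\tilde\omega,\theta\rangle\,d\hat Q$.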

Proof. It is a corollary of the proof of Theorem 4.6. One applies the abstract results of
Section 3.1 with

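$$\Phi(u) = I_\lambda(u) := \int_Z \lambda(u)\,dR, \quad u \in \mathcal{U}_o := E_{\lambda_\diamond}. \qquad (5.4)$$

This gives $\mathcal{U} = E_{\lambda_\diamond}$ with the Orlicz norm $|u|_\Phi = \|u\|_{\lambda_\diamond}$ and $\mathcal{L} = L_{\lambda_\diamond^*}R$. The space $\mathcal{Y} := \mathcal{Y}_E$
is the completion of $\mathcal{Y}_o$ endowed with the norm $|y|_\Lambda = \|\langle y,\theta\rangle\|_{\lambda_\diamond}$. It is isomorphic to the
closure of the subspace $\{\langle y,\theta\rangle;\ y \in \mathcal{Y}_o\}$ in $E_{\lambda_\diamond}$, see assumption (Ag). The topological
dual space $\mathcal{X}_E = \mathcal{Y}'_E$ is identified with $L_{\lambda_\diamond^*}R/\ker T$ and its norm is given by
$|x|^*_\Lambda = \inf\{\|f\|_{\lambda_\diamond^*};\ f \in L_{\lambda_\diamond^*} : T(fR) = x\}$.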

The assumption (3) is: $C$ is a convex $\sigma(\mathcal{X}_E, \mathcal{Y}_E)$-closed set.

As in the proof of Theorem 4.6, one reduces to the case where $m = 0$ without loss of
generality.

The assumption (2) implies that $\lambda$ is a finite function. It follows that $E'_{\lambda_\diamond} = L_{\lambda_\diamond^*}$, the
convex conjugate $\Phi^*$ of $\Phi$ with respect to the duality $\langle E_{\lambda_\diamond}, L_{\lambda_\diamond^*}\rangle$ is

(see [20]) and the corresponding extended function $ is 

if ( is in Kx-R © and +oo otherwise. 

With these correspondences, the proof of the theorem is an immediate translation of the
proof of Theorem 4.6. □

Remarks 5.5. 

(a) The assumption (2) implies that $\lambda$ is a finite function. Note that otherwise one would
get $E_{\lambda_\diamond} = \{0\}$.

(b) As in Remark 4.10-d, removing the assumption (A^*): $m \in L_{\lambda_\diamond}$, one can still consider
the minimization problem $(P_{C_o})$ instead of $(P_C)$. The transcription of Theorem 5.2 is
as follows. Replace respectively $(P_C)$, $C$, $\Gamma^*$, $\hat x$ and $\gamma$ by $(P_{C_o})$, $C_o$, $\Lambda^*$, $\tilde x$ and $\lambda$ where
$\tilde x = \int_Z \theta\,d(\hat Q - mR)$ is well-defined.

The statement (b) must be replaced by the following one: If $C_o \cap \mathrm{dom}\,\Lambda^* \neq \emptyset$, then
$(P_{C_o})$ admits a unique solution $\hat Q$ in $mR + L_{\lambda_\diamond^*}R$ and any minimizing sequence $(Q_n)_{n\ge1}$
is such that $(Q_n - mR)_{n\ge1}$ converges in $L_{\lambda_\diamond^*}R$ to $\hat Q - mR$ with respect to the topology
$\sigma(L_{\lambda_\diamond^*}R, E_{\lambda_\diamond})$.

Seeing Theorem 5.2 as a direct corollary of Theorem 4.6 would have been possible
since Proposition 2.5 ensures that
$T^{-1}C \cap L'_{\lambda_\diamond} = \bigcap_{y \in Y}\{\ell \in L'_{\lambda_\diamond};\ \langle\langle y,\theta\rangle, \ell\rangle \ge a_y\} = \bigcap_{y \in Y}\{fR \in L_{\lambda_\diamond^*}R;\ \int_Z \langle y,\theta\rangle f\,dR \ge a_y\}$
whenever (Ag) holds. But the drawback is
that the unnecessary assumption (2) of Theorem 4.6 has to be kept.

6. Examples 

Standard examples of entropy minimization problems are presented. 

6.1. Some examples of entropies. The entropies defined below occur naturally in 
statistical physics, probability theory, mathematical statistics and information theory. 

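Boltzmann entropy. The Boltzmann entropy with respect to the positive measure $R$ is defined by

$$H_B(Q|R) = \begin{cases} \int_Z \log\left(\frac{dQ}{dR}\right) dQ & \text{if } 0 \le Q \ll R \\ +\infty & \text{otherwise} \end{cases}$$

for each $Q \in M_Z$. It corresponds to

$$\gamma_z^*(t) = \begin{cases} t\log t & \text{if } t > 0 \\ 0 & \text{if } t = 0 \\ +\infty & \text{if } t < 0. \end{cases}$$

But this $\gamma^*$ takes negative values and is ruled out by our
assumptions. A way to circumvent this problem is to consider the variant below.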

A variant of the Boltzmann entropy. Let $m : Z \to (0, \infty)$ be a positive measurable
function. Considering

$$\gamma_z^*(t) = t\log t - [1 + \log m(z)]\,t + m(z), \quad t > 0,$$

one sees that it is nonnegative and that $\gamma_z^*(t) = 0$ if and only if $t = m(z)$. Hence $\gamma^*$ enters
the framework of this paper and

$$\lambda_z(s) = m(z)\,[e^s - s - 1], \quad s \in \mathbb{R}. \qquad (6.1)$$
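The closed form (6.1) can be checked numerically. The sketch below assumes that $\lambda_z(s) = \gamma_z(s) - m(z)s$, where $\gamma_z$ is the convex conjugate of $\gamma_z^*$; this convention is an assumption made for the illustration, and the function names, the grid size and the cut-off `t_max` are hypothetical choices, not part of the paper.

```python
import math

def gamma_star(t, m):
    """Variant integrand t log t - (1 + log m) t + m, extended by m at t = 0."""
    if t < 0:
        return math.inf
    if t == 0:
        return m
    return t * math.log(t) - (1.0 + math.log(m)) * t + m

def lam_closed_form(s, m):
    """Closed form (6.1): m (e^s - s - 1)."""
    return m * (math.exp(s) - s - 1.0)

def lam_numeric(s, m, t_max=50.0, n=200_000):
    """Grid approximation of sup_{0 <= t <= t_max} { s t - gamma_star(t) } - m s.

    Assumption: the maximizer t = m e^s lies well inside [0, t_max].
    """
    best = -math.inf
    for i in range(n + 1):
        t = t_max * i / n
        best = max(best, s * t - gamma_star(t, m))
    return best - m * s
```

For moderate values of `s` and `m`, `lam_numeric(s, m)` and `lam_closed_form(s, m)` should agree up to the grid error, and `gamma_star(m, m)` should vanish.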
It is easily seen that

$$H_B(Q|R) = I(Q) + \int_Z (1 + \log m)\,dQ - \int_Z m\,dR$$

which is meaningful if $Q$ integrates $1 + \log m$ where $m \in L^1(R)$.

As an application, let $R$ be the Lebesgue measure on $Z = \mathbb{R}^d$ and minimize $H_B(Q|R)$
on the set $C = \{Q \in P_Z;\ \int_Z |z|^2\,Q(dz) = E\} \cap C_o$. Taking $m(z) = e^{-|z|^2}$, one is led to
minimizing $I$ on $C$.

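A special case. It is defined by

$$H(Q|R) = \begin{cases} \int_Z \left[\frac{dQ}{dR}\log\left(\frac{dQ}{dR}\right) - \frac{dQ}{dR} + 1\right] dR & \text{if } 0 \le Q \ll R \\ +\infty & \text{otherwise.} \end{cases} \qquad (6.2)$$

It corresponds to

$$\gamma_z^*(t) = \begin{cases} t\log t - t + 1 & \text{if } t > 0 \\ 1 & \text{if } t = 0 \\ +\infty & \text{if } t < 0, \end{cases}$$

$m(z) = 1$ and $\lambda_z(s) = e^s - s - 1$,
$s \in \mathbb{R}$ for all $z \in Z$. Note that $H(Q|R) < \infty$ implies that $Q$ is nonnegative.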
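To illustrate the representation (5.3-c) of Theorem 5.2 in this special case, here is a minimal sketch on a finite space $Z$ with a single moment constraint $\int_Z \theta\,dQ = x$. It assumes $\gamma(s) = e^s - 1$ (the convex conjugate of the $\gamma^*$ above), so that the candidate minimizer is $\hat Q = e^{\omega\theta}R$, and it solves the one-dimensional dual problem by bisection. The data `theta`, `r`, `x`, the bracket `[-5, 5]` and all function names are hypothetical.

```python
import math

def solve_dual(theta, r, x, lo=-5.0, hi=5.0, iters=200):
    """Bisection on the derivative of omega -> omega*x - sum_i r_i (e^{omega*theta_i} - 1).

    The derivative is decreasing in omega, so its zero is the dual maximizer.
    """
    def dg(omega):
        return x - sum(ri * ti * math.exp(omega * ti) for ti, ri in zip(theta, r))
    assert dg(lo) > 0 > dg(hi), "bracket does not contain the maximizer"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dg(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy_H(q, r):
    """H(Q|R) = sum_i [t_i log t_i - t_i + 1] r_i with t_i = q_i / r_i (value 1 at t_i = 0)."""
    total = 0.0
    for qi, ri in zip(q, r):
        t = qi / ri
        total += (t * math.log(t) - t + 1.0) * ri if t > 0 else ri
    return total

theta = [0.0, 1.0, 2.0, 3.0]      # constraint function on Z = {0, 1, 2, 3}
r = [0.25, 0.25, 0.25, 0.25]      # reference measure R
x = 2.0                           # prescribed moment
omega = solve_dual(theta, r, x)
q_hat = [ri * math.exp(omega * ti) for ti, ri in zip(theta, r)]  # Q = gamma'(omega*theta) R
```

With these data, `q_hat` satisfies the moment constraint and the identity of statement (d)-3 can be checked by summing `entropy_H(q_hat, r)` and the values of $\gamma(\omega\theta)$ weighted by `r`.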

Relative entropy. The reference measure $R$ is assumed to be a probability measure and
one denotes $P_Z$ the set of all probability measures on $Z$. The relative entropy of $Q \in M_Z$
with respect to $R \in P_Z$ is the following variant of the Boltzmann entropy:

$$I(Q|R) = \begin{cases} \int_Z \log\left(\frac{dQ}{dR}\right) dQ & \text{if } Q \ll R \text{ and } Q \in P_Z \\ +\infty & \text{otherwise.} \end{cases}$$

It is (6.2) with the additional constraint that $Q(Z) = 1$:

$$I(Q|R) = H(Q|R) + \iota_{\{Q(Z) = 1\}}.$$

When minimizing the Boltzmann entropy $Q \mapsto H_B(Q|R)$ on a constraint set which is
included in $P_Z$, we have for all $P, Q \in P_Z$,

$$H_B(Q|R) = I(Q|P) + \int_Z \log\left(\frac{dP}{dR}\right) dQ$$
which is meaningful for each $Q \in P_Z$ which integrates $\log\left(\frac{dP}{dR}\right)$.
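The identity above follows from the chain rule for densities. The short derivation below is a sketch under the assumptions, not spelled out in the text, that $Q \ll P \ll R$ and that $\log(dP/dR)$ is $Q$-integrable.

```latex
\begin{align*}
H_B(Q|R)
  &= \int_Z \log\frac{dQ}{dR}\,dQ
   = \int_Z \log\Big(\frac{dQ}{dP}\,\frac{dP}{dR}\Big)\,dQ \\
  &= \int_Z \log\frac{dQ}{dP}\,dQ + \int_Z \log\frac{dP}{dR}\,dQ
   = I(Q|P) + \int_Z \log\frac{dP}{dR}\,dQ .
\end{align*}
```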

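Reverse relative entropy. The reference measure $R$ is assumed to be a probability measure.
The reverse relative entropy is

$$Q \in M_Z \mapsto \begin{cases} I(R|Q) & \text{if } Q \in P_Z \\ +\infty & \text{otherwise} \end{cases} \in [0, \infty].$$

It corresponds to $\gamma_z^*(t) = \begin{cases} -\log t + t - 1 & \text{if } t > 0 \\ +\infty & \text{if } t \le 0 \end{cases}$, $m(z) = 1$ and

$$\lambda_z(s) = \begin{cases} -\log(1 - s) - s & \text{if } s < 1 \\ +\infty & \text{if } s \ge 1 \end{cases} \qquad (6.3)$$

for all $z \in Z$, with the additional constraint that $Q(Z) = 1$.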
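The expression (6.3) can be recovered by a direct computation. The derivation below is a sketch which assumes $\gamma_z^*(t) = t - \log t - 1$ for $t > 0$, $m(z) = 1$, and the convention $\lambda_z(s) = \gamma_z(s) - m(z)s$ with $\gamma_z$ the convex conjugate of $\gamma_z^*$.

```latex
\begin{align*}
\gamma_z(s) &= \sup_{t>0}\,\{st - t + \log t + 1\}, \\
0 &= s - 1 + \frac{1}{t} \iff t = \frac{1}{1-s} \qquad (s < 1), \\
\gamma_z(s) &= \frac{s-1}{1-s} - \log(1-s) + 1 = -\log(1-s), \qquad s < 1, \\
\lambda_z(s) &= \gamma_z(s) - s = -\log(1-s) - s, \qquad s < 1 .
\end{align*}
```

For $s \ge 1$ the supremum is $+\infty$, since $st - t + \log t + 1 \to +\infty$ as $t \to \infty$.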

6.2. Some examples of constraints. Let us consider two standard constraints which 
are the moment constraints and the marginal constraints. 
Moment constraints. Let $\theta = (\theta_k)_{1 \le k \le K}$ be a measurable function from $Z$ to $\mathcal{X}_o = \mathbb{R}^K$.
The moment constraint is specified by the operator

$$T_o\ell = \int_Z \theta\,d\ell = \left(\int_Z \theta_k\,d\ell\right)_{1 \le k \le K} \in \mathbb{R}^K,$$

which is defined for each $\ell \in M_Z$ which integrates all the real valued measurable functions
$\theta_k$. The adjoint operator is

$$T_o^*y(z) = \sum_{1 \le k \le K} y_k\theta_k(z), \quad y = (y_1, \ldots, y_K) \in \mathbb{R}^K,\ z \in Z.$$
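On a finite space $Z$ the moment operator and its adjoint reduce to matrix-vector products, and the duality $\langle y, T_o\ell\rangle = \langle T_o^*y, \ell\rangle$ can be checked directly. The following sketch assumes $Z$ is finite and represents $\theta$ as a list of rows $(\theta_1(z), \ldots, \theta_K(z))$; the names `T_o` and `T_o_adjoint` and the sample data are hypothetical.

```python
def T_o(theta, ell):
    """Moment operator: ell -> (sum_z theta_k(z) ell(z))_k for a signed measure ell on a finite Z."""
    K = len(theta[0])
    return [sum(theta[z][k] * ell[z] for z in range(len(ell))) for k in range(K)]

def T_o_adjoint(theta, y):
    """Adjoint operator: y -> the function z -> sum_k y_k theta_k(z)."""
    return [sum(yk * tk for yk, tk in zip(y, theta_z)) for theta_z in theta]

# Sample data: Z = {0, 1, 2}, theta_1(z) = 1, theta_2(z) = z.
theta = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ell = [0.5, -0.25, 1.0]   # a signed measure on Z
y = [2.0, -1.0]

pairing_primal = sum(yk * mk for yk, mk in zip(y, T_o(theta, ell)))
pairing_dual = sum(fz * lz for fz, lz in zip(T_o_adjoint(theta, y), ell))
```

Both pairings compute the same number, which is the defining property of the adjoint.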

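Marginal constraints. Let $Z = A \times B$ be a product space, $M_{AB}$ be the space of all bounded
signed measures on $A \times B$ and $U_{AB}$ be the space of all measurable bounded functions $u$
on $A \times B$. Denote $\ell_A = \ell(\cdot \times B)$ and $\ell_B = \ell(A \times \cdot)$ the marginal measures of $\ell \in M_{AB}$.
The constraint of prescribed marginal measures is specified by

$$\int_{A \times B} \theta\,d\ell = (\ell_A, \ell_B) \in M_A \times M_B, \quad \ell \in M_{AB}$$

where $M_A$ and $M_B$ are the spaces of all bounded signed measures on $A$ and $B$. The
function $\theta$ which gives the marginal constraint is

$$\theta(a, b) = (\delta_a, \delta_b), \quad a \in A,\ b \in B$$

where $\delta_a$ is the Dirac measure at $a$. Indeed, $(\ell_A, \ell_B) = \int_{A \times B} (\delta_a, \delta_b)\,\ell(da\,db)$.

More precisely, let $U_A$, $U_B$ be the spaces of measurable functions on $A$ and $B$ and take
$\mathcal{Y}_o = U_A \times U_B$ and $\mathcal{X}_o = U_A^* \times U_B^*$. Then, $\theta$ is a measurable function from $Z = A \times B$ to
$\mathcal{X}_o = U_A^* \times U_B^*$. It is easy to see that the adjoint of the marginal operator

$$T_o\ell = (\ell_A, \ell_B) \in U_A^* \times U_B^*, \quad \ell \in \mathcal{L}_o = U_{AB}^*$$
where $\langle f, \ell_A\rangle := \langle f \otimes 1, \ell\rangle$ and $\langle g, \ell_B\rangle := \langle 1 \otimes g, \ell\rangle$ for all $f \in U_A$ and $g \in U_B$, is given by

$$T_o^*(f, g) = f \oplus g \in U_{AB}, \quad f \in U_A,\ g \in U_B \qquad (6.4)$$
where $f \oplus g(a, b) := f(a) + g(b)$, $a \in A$, $b \in B$.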
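The adjoint formula (6.4) can be illustrated on finite sets $A$ and $B$, where a signed measure on $A \times B$ is a matrix and its marginals are the row and column sums. This is only a sketch under that finiteness assumption; the names `marginals` and `direct_sum` and the sample data are hypothetical.

```python
def marginals(ell):
    """Marginals (ell_A, ell_B) of a signed measure given as a matrix ell[a][b]."""
    ell_A = [sum(row) for row in ell]
    ell_B = [sum(ell[a][b] for a in range(len(ell))) for b in range(len(ell[0]))]
    return ell_A, ell_B

def direct_sum(f, g):
    """The function (f (+) g)(a, b) = f(a) + g(b), stored as a matrix."""
    return [[fa + gb for gb in g] for fa in f]

ell = [[0.5, -0.25], [0.0, 1.0], [0.25, 0.5]]   # signed measure on A x B, |A| = 3, |B| = 2
f = [1.0, 2.0, -1.0]                            # function on A
g = [3.0, 0.5]                                  # function on B

ell_A, ell_B = marginals(ell)
# <f, ell_A> + <g, ell_B>
lhs = sum(fa * la for fa, la in zip(f, ell_A)) + sum(gb * lb for gb, lb in zip(g, ell_B))
# <f (+) g, ell>
fg = direct_sum(f, g)
rhs = sum(fg[a][b] * ell[a][b] for a in range(len(f)) for b in range(len(g)))
```

The two quantities `lhs` and `rhs` coincide, which is the finite-dimensional form of the identity $\langle (f, g), T_o\ell\rangle = \langle f \oplus g, \ell\rangle$.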